SEQUENCE LISTING 



(1) GENERAL INFORMATION:



(i) APPLICANT: 



De Robertis, Edward
Bouwmeester, Tewis



(ii) TITLE OF INVENTION: Endoderm, Cardiac and Neural Inducing 
Factors 




(iii) NUMBER OF SEQUENCES : 10 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Majestic, Parsons, Siebert & Hsue 

(B) STREET: Four Embarcadero Center, Suite 1100

(C) CITY: San Francisco

(D) STATE: California

(E) COUNTRY: U.S.A.

(F) ZIP: 94111-4106

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC Compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25



(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/878,474

(B) FILING DATE: 18-JUN-1997

(C) CLASSIFICATION:



(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 60/020,150

(B) FILING DATE: 20-JUN-1996

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Siebert, J. Suzanne

(B) REGISTRATION NUMBER: 28,758

(C) REFERENCE/DOCKET NUMBER: 3100.002US1

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 415/248-5500 

(B) TELEFAX: 415/362-5418 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Leu Leu Asn Val Leu Arg Ile Cys Ile Ile Val Cys Leu Val Asn
1 5 10 15

Asp Gly Ala Gly Lys His Ser Glu Gly Arg Glu Arg Thr Lys Thr Tyr
20 25 30

Ser Leu Asn Ser Arg Gly Tyr Phe Arg Lys Glu Arg Gly Ala Arg Arg
35 40 45

Ser Lys Ile Leu Leu Val Asn Thr Lys Gly Leu Asp Glu Pro His Ile
50 55 60

Gly His Gly Asp Phe Gly Leu Val Ala Glu Leu Phe Asp Ser Thr Arg
65 70 75 80

Thr His Thr Asn Arg Lys Glu Pro Asp Met Asn Lys Val Lys Leu Phe
85 90 95

Ser Thr Val Ala His Gly Asn Lys Ser Ala Arg Arg Lys Ala Tyr Asn
100 105 110

Gly Ser Arg Arg Asn Ile Phe Ser Arg Arg Ser Phe Asp Lys Arg Asn
115 120 125

Thr Glu Val Thr Glu Lys Pro Gly Ala Lys Met Phe Trp Asn Asn Phe
130 135 140

Leu Val Lys Met Asn Gly Ala Pro Gln Asn Thr Ser His Gly Ser Lys
145 150 155 160

Ala Gln Glu Ile Met Lys Glu Ala Cys Lys Thr Leu Pro Phe Thr Gln
165 170 175

Asn Ile Val His Glu Asn Cys Asp Arg Met Val Ile Gln Asn Asn Leu
180 185 190

Cys Phe Gly Lys Cys Ile Ser Leu His Val Pro Asn Gln Gln Asp Arg
195 200 205

Arg Asn Thr Cys Ser His Cys Leu Pro Ser Lys Phe Thr Leu Asn His
210 215 220

Leu Thr Leu Asn Cys Thr Gly Ser Lys Asn Val Val Lys Val Val Met
225 230 235 240

Met Val Glu Glu Cys Thr Cys Glu Ala His Lys Ser Asn Phe His Gln
245 250 255

Thr Ala Gln Phe Asn Met Asp Thr Ser Thr Thr Leu His His
260 265 270

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GAATTCCTAA AAGCGGCACA GTGCAGGAAC AGCAAGTCGC TCAGAAACAC TGCAGGGTCT 60

AGATATCATA CAATGTTACT AAATGTACTC AGGATCTGTA TTATCGTCTG CCTTGTGAAT 120

GATGGAGCAG GAAAACACTC AGAAGGACGA GAAAGGACAA AAACATATTC ACTTAACAGC 180

AGAGGTTACT TCAGAAAAGA AAGAGGAGCA CGTAGGAGCA AGATTCTGCT GGTGAATACT 240

AAAGGTCTTG ATGAACCCCA CATTGGGCAT GGTGATTTTG GCTTAGTAGC TGAACTATTT 300

GATTCCACCA GAACACATAC AAACAGAAAA GAGCCAGACA TGAACAAAGT CAAGCTTTTC 360

TCAACAGTTG CCCATGGAAA CAAAAGTGCA AGAAGAAAAG CTTACAATGG TTCTAGAAGG 420

AATATTTTTT CTCGCCGTTC TTTTGATAAA AGAAATACAG AGGTTACTGA AAAGCCTGGT 480

GCCAAGATGT TCTGGAACAA TTTTTTGGTT AAAATGAATG GAGCCCCACA GAATACAAGC 540

CATGGCAGTA AAGCACAGGA AATAATGAAA GAAGCTTGCA AAACCTTGCC CTTCACTCAG 600

AATATTGTAC ATGAAAACTG TGACAGGATG GTGATACAGA ACAATCTGTG CTTTGGTAAA 660

TGCATCTCTC TCCATGTTCC AAATCAGCAA GATCGACGAA ATACTTGTTC CCATTGCTTG 720

CCGTCCAAAT TTACCCTGAA CCACCTGACG CTGAATTGTA CTGGATCTAA GAATGTAGTA 780

AAGGTTGTCA TGATGGTAGA GGAATGCACG TGTGAAGCTC ATAAGAGCAA CTTCCACCAA 840

ACTGCACAGT TTAACATGGA TACATCTACT ACCCTGCACC ATTAAAAGGA CTGTCTGCCA 900

TACAGTATGG AAATGCCCAT TTGTTGGAAT ATTCGTTACA TGCTATGTAT CTAAAGCATT 960

ATGTTGCCTT CTGTTTCATA TAACCACATG GAATAAGGAT TGTATGAATT ATAATTAACA 1020

AATGGCATTT TGTGTAACAT GCAAGATCTC TGTTCCATCA GTTGCAAGAT AAAAGGCAAT 1080

ATTTGTTTGA CTTTTTTCTA CAAAATGAAT ACCCAAATAT ATGATAAGAT AATGGGGTCA 1140

AAACTGTTAA GGGGTAATGT AATAATAGGG ACTAACAACC AATCAGCAGG TATGATTTAC 1200

TGGTCACCTG TTTAAAAGCA AACATCTTAT TGGTTGCTAT GGGTTACTGC TTCTGGGCAA 1260

AATGTGTGCC TCATAGGGGG GTTAGTGTGT TGTGTAGTGA ATTAATTGTA TTTATTTCAT 1320

TGTTACAATG AAGAGGATGT CTATGTTTAT TTCACTTTTA TTAATGTACA ATAAATGTTC 1380

TTGTTTCTTT AAAAAAAAAA AAAAACTCGA G 1411

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 318 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Ser Arg Thr Arg Lys Val Asp Ser Leu Leu Leu Leu Ala Ile Pro
1 5 10 15

Gly Leu Ala Leu Leu Leu Leu Pro Asn Ala Tyr Cys Ala Ser Cys Glu
20 25 30

Pro Val Arg Ile Pro Met Cys Lys Ser Met Pro Trp Asn Met Thr Lys
35 40 45

Met Pro Asn His Leu His His Ser Thr Gln Ala Asn Ala Ile Leu Ala
50 55 60

Ile Glu Gln Phe Glu Gly Leu Leu Thr Thr Glu Cys Ser Gln Asp Leu
65 70 75 80

Leu Phe Phe Leu Cys Ala Met Tyr Ala Pro Ile Cys Thr Ile Asp Phe
85 90 95

Gln His Glu Pro Ile Lys Pro Cys Lys Ser Val Cys Glu Arg Ala Arg
100 105 110

Ala Gly Cys Glu Pro Ile Leu Ile Lys Tyr Arg His Thr Trp Pro Glu
115 120 125

Ser Leu Ala Cys Glu Glu Leu Pro Val Tyr Asp Arg Gly Val Cys Ile
130 135 140

Ser Pro Glu Ala Ile Val Thr Val Glu Gln Gly Thr Asp Ser Met Pro
145 150 155 160

Asp Phe Ser Met Asp Ser Asn Asn Gly Asn Cys Gly Ser Gly Arg Glu
165 170 175

His Cys Lys Cys Lys Pro Met Lys Ala Thr Gln Lys Thr Tyr Leu Lys
180 185 190

Asn Asn Tyr Asn Tyr Val Ile Arg Ala Lys Val Lys Glu Val Lys Val
195 200 205

Lys Cys His Asp Ala Thr Ala Ile Val Glu Val Lys Glu Ile Leu Lys
210 215 220

Ser Ser Leu Val Asn Ile Pro Lys Asp Thr Val Thr Leu Tyr Thr Asn
225 230 235 240

Ser Gly Cys Leu Cys Pro Gln Leu Val Ala Asn Glu Glu Tyr Ile Ile
245 250 255

Met Gly Tyr Glu Asp Lys Glu Arg Thr Arg Leu Leu Leu Val Glu Gly
260 265 270

Ser Leu Ala Glu Lys Trp Arg Asp Arg Leu Ala Lys Lys Val Lys Arg
275 280 285

Trp Asp Gln Lys Leu Arg Arg Pro Arg Lys Ser Lys Asp Pro Val Ala
290 295 300

Pro Ile Pro Asn Lys Asn Ser Asn Ser Arg Gln Ala Arg Ser
305 310 315



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1875 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:





GAATTCCCTT TCACACAGGA CTCCTGGCAG AGGTGAATGG TTAGCCCTAT GGATTTGGTT 60

TGTTGATTTT GACACATGAT TGATTGCTTT CAGATAGGAT TGAAGGACTT GGATTTTTAT 120

CTAATTCTGC ACTTTTAAAT TATCTGAGTA ATTGTTCATT TTGTATTGGA TGGGACTAAA 180

GATAAACTTA ACTCCTTGCT TTTGACTTGC CCATAAACTA TAAGGTGGGG TGAGTTGTAG 240

TTGCTTTTAC ATGTGCCCAG ATTTTCCCTG TATTCCCTGT ATTCCCTCTA AAGTAAGCCT 300

ACACATACAG GTTGGGCAGA ATAACAATGT CTCGAACAAG GAAAGTGGAC TCATTACTGC 360

TACTGGCCAT ACCTGGACTG GCGCTTCTCT TATTACCCAA TGCTTACTGT GCTTCGTGTG 420

AGCCTGTGCG GATCCCCATG TGCAAATCTA TGCCATGGAA CATGACCAAG ATGCCCAACC 480

ATCTCCACCA CAGCACTCAA GCCAATGCCA TCCTGGCAAT TGAACAGTTT GAAGGTTTGC 540

TGACCACTGA ATGTAGCCAG GACCTTTTGT TCTTTCTGTG TGCCATGTAT GCCCCCATTT 600

GTACCATCGA TTTCCAGCAT GAACCAATTA AGCCTTGCAA GTCCGTGTGC GAAAGGGCCA 660

GGGCCGGCTG TGAGCCCATT CTCATAAAGT ACCGGCACAC TTGGCCAGAG AGCCTGGCAT 720

GTGAAGAGCT GCCCGTATAT GACAGAGGAG TCTGCATCTC CCCAGAGGCT ATCGTCACAG 780

TGGAACAAGG AACAGATTCA ATGCCAGACT TCTCCATGGA TTCAAACAAT GGAAATTGCG 840

GAAGCGGCAG GGAGCACTGT AAATGCAAGC CCATGAAGGC AACCCAAAAG ACGTATCTCA 900

AGAATAATTA CAATTATGTA ATCAGAGCAA AAGTGAAAGA GGTGAAAGTG AAATGCCACG 960

ACGCAACAGC AATTGTGGAA GTAAAGGAGA TTCTCAAGTC TTCCCTAGTG AACATTCCTA 1020

AAGACACAGT GACACTGTAC ACCAACTCAG GCTGCTTGTG CCCCCAGCTT GTTGCCAATG 1080

AGGAATACAT AATTATGGGC TATGAAGACA AAGAGCGTAC CAGGCTTCTA CTAGTGGAAG 1140

GATCCTTGGC CGAAAAATGG AGAGATCGTC TTGCTAAGAA AGTCAAGCGC TGGGATCAAA 1200

AGCTTCGACG TCCCAGGAAA AGCAAAGACC CCGTGGCTCC AATTCCCAAC AAAAACAGCA 1260

ATTCCAGACA AGCGCGTAGT TAGACTAACG GAAAGGTGTA TGGAAACTCT ATGGACTTTG 1320

AAACTAAGAT TTGCATTGTT GGAAGAGCAA AAAAGAAATT GCACTACAGC ACGTTATATT 1380

CTATTGTTTA CTACAAGAAG CTGGTTTAGT TGATTGTAGT TCTCCTTTCC TTCTTTTTTT 1440

TTATAACTAT ATTTGCACGT GTTCCCAGGC AATTGTTTTA TTCAACTTCC AGTGACAGAG 1500

CAGTGACTGA ATGTCTCAGC CTAAAGAAGC TCAATTCATT TCTGATCAAC TAATGGTGAC 1560

AAGTGTTTGA TACTTGGGGA AAGTGAACTA ATTGCAATGG TAAATCAGAG AAAAGTTGAC 1620 

CAATGTTGCT TTTCCTGTAG ATGAACAAGT GAGAGATCAC ATTTAAATGA TGATCACTTT 1680 

CCATTTAATA CTTTCAGCAG TTTTAGTTAG ATGACATGTA GGATGCACCT AAATCTAAAT 1740 

ATTTTATCAT AAATGAAGAG CTGGTTTAGA CTGTATGGTC ACTGTTGGGA AGGTAAATGC 1800 

CTACTTTGTC AATTCTGTTT TAAAAATTGC CTAAATAAAT ATTAAGTCCT AAATAAAAAA 1860 

AAAAAAAAAA AAAAA 1875 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 979 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Leu Leu Leu Phe Arg Ala Ile Pro Met Leu Leu Leu Gly Leu Met
1 5 10 15

Val Leu Gln Thr Asp Cys Glu Ile Ala Gln Tyr Tyr Ile Asp Glu Glu
20 25 30

Glu Pro Pro Gly Thr Val Ile Ala Val Leu Ser Gln His Ser Ile Phe
35 40 45

Asn Thr Thr Asp Ile Pro Ala Thr Asn Phe Arg Leu Met Lys Gln Phe
50 55 60

Asn Asn Ser Leu Ile Gly Val Arg Glu Ser Asp Gly Gln Leu Ser Ile
65 70 75 80

Met Glu Arg Ile Asp Arg Glu Gln Ile Cys Arg Gln Ser Leu His Cys
85 90 95

Asn Leu Ala Leu Asp Val Val Ser Phe Ser Lys Gly His Phe Lys Leu
100 105 110

Leu Asn Val Lys Val Glu Val Arg Asp Ile Asn Asp His Ser Pro His
115 120 125

Phe Pro Ser Glu Ile Met His Val Glu Val Ser Glu Ser Ser Ser Val
130 135 140

Gly Thr Arg Ile Pro Leu Glu Ile Ala Ile Asp Glu Asp Val Gly Ser
145 150 155 160

Asn Ser Ile Gln Asn Phe Gln Ile Ser Asn Asn Ser His Phe Ser Ile
165 170 175

Asp Val Leu Thr Arg Ala Asp Gly Val Lys Tyr Ala Asp Leu Val Leu
180 185 190

Met Arg Glu Leu Asp Arg Glu Ile Gln Pro Thr Tyr Ile Met Glu Leu
195 200 205

Leu Ala Met Asp Gly Gly Val Pro Ser Leu Ser Gly Thr Ala Val Val
210 215 220

Asn Ile Arg Val Leu Asp Phe Asn Asp Asn Ser Pro Val Phe Glu Arg
225 230 235 240

Ser Thr Ile Ala Val Asp Leu Val Glu Asp Ala Pro Leu Gly Tyr Leu
245 250 255

Leu Leu Glu Leu His Ala Thr Asp Asp Asp Glu Gly Val Asn Gly Glu
260 265 270

Ile Val Tyr Gly Phe Ser Thr Leu Ala Ser Gln Glu Val Arg Gln Leu
275 280 285

Phe Lys Ile Asn Ser Arg Thr Gly Ser Val Thr Leu Glu Gly Gln Val
290 295 300

Asp Phe Glu Thr Lys Gln Thr Tyr Glu Phe Glu Val Gln Ala Gln Asp
305 310 315 320

Leu Gly Pro Asn Pro Leu Thr Ala Thr Cys Lys Val Thr Val His Ile
325 330 335

Leu Asp Val Asn Asp Asn Thr Pro Ala Ile Thr Ile Thr Pro Leu Thr
340 345 350

Thr Val Asn Ala Gly Val Ala Tyr Ile Pro Glu Thr Ala Thr Lys Glu
355 360 365

Asn Phe Ile Ala Leu Ile Ser Thr Thr Asp Arg Ala Ser Gly Ser Asn
370 375 380

Gly Gln Val Arg Cys Thr Leu Tyr Gly His Glu His Phe Lys Leu Gln
385 390 395 400

Gln Ala Tyr Glu Asp Ser Tyr Met Ile Val Thr Thr Ser Thr Leu Asp
405 410 415

Arg Glu Asn Ile Ala Ala Tyr Ser Leu Thr Val Val Ala Glu Asp Leu
420 425 430

Gly Phe Pro Ser Leu Lys Thr Lys Lys Tyr Tyr Thr Val Lys Val Ser
435 440 445

Asp Glu Asn Asp Asn Ala Pro Val Phe Ser Lys Pro Gln Tyr Glu Ala
450 455 460

Ser Ile Leu Glu Asn Asn Ala Pro Gly Ser Tyr Ile Thr Thr Val Ile
465 470 475 480

Ala Arg Asp Ser Asp Ser Asp Gln Asn Gly Lys Val Asn Tyr Arg Leu
485 490 495

Val Asp Ala Lys Val Met Gly Gln Ser Leu Thr Thr Phe Val Ser Leu
500 505 510

Asp Ala Asp Ser Gly Val Leu Arg Ala Val Arg Ser Leu Asp Tyr Glu
515 520 525

Lys Leu Lys Gln Leu Asp Phe Glu Ile Glu Ala Ala Asp Asn Gly Ile
530 535 540

Pro Gln Leu Ser Thr Arg Val Gln Leu Asn Leu Arg Ile Val Asp Gln
545 550 555 560

Asn Asp Asn Cys Pro Val Ile Thr Asn Pro Leu Leu Asn Asn Gly Ser
565 570 575

Gly Glu Val Leu Leu Pro Ile Ser Ala Pro Gln Asn Tyr Leu Val Phe
580 585 590

Gln Leu Lys Ala Glu Asp Ser Asp Glu Gly His Asn Ser Gln Leu Phe
595 600 605

Tyr Thr Ile Leu Arg Asp Pro Ser Arg Leu Phe Ala Ile Asn Lys Glu
610 615 620

Ser Gly Glu Val Phe Leu Lys Lys Gln Leu Asn Ser Asp His Ser Glu
625 630 635 640

Asp Leu Ser Ile Val Val Ala Val Tyr Asp Leu Gly Arg Pro Ser Leu
645 650 655

Ser Thr Asn Ala Thr Val Lys Phe Ile Leu Thr Asp Ser Phe Pro Ser
660 665 670

Asn Val Glu Val Val Ile Leu Gln Pro Ser Ala Glu Glu Gln His Gln
675 680 685

Ile Asp Met Ser Ile Ile Phe Ile Ala Val Leu Ala Gly Gly Cys Ala
690 695 700

Leu Leu Leu Leu Ala Ile Phe Phe Val Ala Cys Thr Cys Lys Lys Lys
705 710 715 720

Ala Gly Glu Phe Lys Gln Val Pro Glu Gln His Gly Thr Cys Asn Glu
725 730 735

Glu Arg Leu Leu Ser Thr Pro Ser Pro Gln Ser Val Ser Ser Ser Leu
740 745 750

Ser Gln Ser Glu Ser Cys Gln Leu Ser Ile Asn Thr Glu Ser Glu Asn
755 760 765

Cys Ser Val Ser Ser Asn Gln Glu Gln His Gln Gln Thr Gly Ile Lys
770 775 780

His Ser Ile Ser Val Pro Ser Tyr His Thr Ser Gly Trp His Leu Asp
785 790 795 800

Asn Cys Ala Met Ser Ile Ser Gly His Ser His Met Gly His Ile Ser
805 810 815

Thr Lys Asp Ser Gly Lys Gly Asp Ser Asp Phe Asn Asp Ser Asp Ser
820 825 830

Asp Thr Ser Gly Glu Ser Gln Lys Lys Ser Ile Glu Gln Pro Met Gln
835 840 845

Ala Gln Ala Ser Ala Gln Tyr Thr Asp Glu Ser Ala Gly Phe Arg His
850 855 860

Ala Asp Asn Tyr Phe Ser His Arg Ile Asn Lys Gly Pro Glu Asn Gly
865 870 875 880

Asn Cys Thr Leu Gln Tyr Glu Lys Gly Tyr Arg Leu Ser Tyr Ser Val
885 890 895

Ala Pro Ala His Tyr Asn Thr Tyr His Ala Arg Met Pro Asn Leu His
900 905 910

Ile Pro Asn His Thr Leu Arg Asp Pro Tyr Tyr His Ile Asn Asn Pro
915 920 925

Val Ala Asn Arg Met His Ala Glu Tyr Glu Arg Asp Leu Val Asn Arg
930 935 940

Ser Ala Thr Leu Ser Pro Gln Arg Ser Ser Ser Arg Tyr Gln Glu Phe
945 950 955 960

Asn Tyr Ser Pro Gln Ile Ser Arg Gln Leu His Pro Ser Glu Ile Ala
965 970 975

Thr Thr Phe



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 





GAATTCCCAG AGATGAACTC CTTGAGATTG TTTTAAATGA CTGCAGGTCT GGAAGGATTC 60

ACATTGCCAC ACTGTTTCTA GGCATGAAAA AACTGCAAGT TTCAACTTTG TTTTTGGTGC 120

AACTTTGATT CTTCAAGATG CTGCTTCTCT TCAGAGCCAT TCCAATGCTG CTGTTGGGAC 180

TGATGGTTTT ACAAACAGAC TGTGAAATTG CCCAGTACTA CATAGATGAA GAAGAACCCC 240

CTGGCACTGT AATTGCAGTG TTGTCACAAC ACTCCATATT TAACACTACA GATATACCTG 300

CAACCAATTT CCGTCTAATG AAGCAATTTA ATAATTCCCT TATCGGAGTC CGTGAGAGTG 360

ATGGGCAGCT GAGCATCATG GAGAGGATTG ACCGGGAGCA AATCTGCAGG CAGTCCCTTC 420

ACTGCAACCT GGCTTTGGAT GTGGTCAGCT TTTCCAAAGG ACACTTCAAG CTTCTGAACG 480

TGAAAGTGGA GGTGAGAGAC ATTAATGACC ATAGCCCTCA CTTTCCCAGT GAAATAATGC 540

ATGTGGAGGT GTCTGAAAGT TCCTCTGTGG GCACCAGGAT TCCTTTAGAA ATTGCAATAG 600

ATGAAGATGT TGGGTCCAAC TCCATCCAGA ACTTTCAGAT CTCAAATAAT AGCCACTTCA 660

GCATTGATGT GCTAACCAGA GCAGATGGGG TGAAATATGC AGATTTAGTC TTAATGAGAG 720

AACTGGACAG GGAAATCCAG CCAACATACA TAATGGAGCT ACTAGCAATG GATGGGGGTG 780



fPJVfV* JVfHPAPT HTPTPPTAPT 
X AllAI uAL 1 Al\* X VjVjX AL X 


ppaptpptta 

VjLAlJ X VjVJ X x a 


AP ATPPP APT 
ALA X LL VjAVj X 


PPTPPAPTTT 
LL X VjVjAV* XXX 


A A TP AT A APA 
AA X VjA X AAL A 


0 A f\ 
OH U 


pnpn RPTCTT TPAPAPAAPP 
vjL LLnb 1^X1 IVjAVjAVjAAVjL 


APPATTPPTP 

ALLA X X VJL X Vj 


X VjVj/iLL X AVj X 


APAPPATPPT 
AVjAVjVjA X VjV* x 


PPTPTPPPAT 
LL X L X VJVJVjA X 


QAA 


ALLxIIIVjXI VjVjAVjI 


pptaptpapp 

VjL InL XVjALVj 


A TP ATP A APP 
A X VjA X VjAAVjVj 


APTPA ATPPA 
AVJ X Vj AA X VjVj A 


PAAATTPTTT 

VJ AAA X X Vj X X X 


jdU 


ATVjiiAirLAvj LAL 1 X IovjLA 


X L X L AAvj A vj Vj 


TAPPTP A PPT 


RfpcpfnjA 7A A & TT 
AX X XAAAAX X 


A APTPP APA A 
AAL X LLAVjAA 


1 ft o n 
1 u c u 


L TvjVjlAvj xvj 1 XAlXlXXVjAA 


VjVjLLAAo lib 


7\ mmmfpp APAP 

A X X X X bA^AL 


P1HVPP7APAPT 
UAAVjV* AVjAL, X 


TA PP A ATTTP 
XALVjAAX X XVj 


1 ft Q ft 

xu ou 


AVjVjIALAAvjL LLAAVjAI lib 


PPPPPP A APP 
VjVjLLL LAAL L 


P APTP APTPP 
LAL X unU X 


TAPTTPTA A A 
X AL X X Vj X AAA 


PTA APTPTTP 
Vj X AAL X Vj X X L 


1 1 A ft 
X X H U 


ATATAPTTPA TPTAAATPAT 
A1A1ALX XVjA X Vj 1 ririri 1 Vjri 1 


A AT APPPP A P 


PP A TP APT AT 


TAPPPPTPTP 

XrVV*V a rV*V* X V*» X VJ 


APTAPTGTAA 
nL x nw x vj x nn 


X £* V V 


arn/v^ jy/^/^K/^m TPPPTATATT 
Al VjLAVjvjAVjX XVjLLXAXAX 1 


PPAPAAAPAP 
L L AVJrLrlnL/i VJ 


PP APA A APP A 


PAAPTTTATA 
VJrvr\L X X X rl X i\ 


PPTPTPATPA 

VJL X L X un X LXl 


126ft 

X £ U v 


VjLAL X nt XoA LAvjAVjLL Itl 


PP ATPTA ATP 

VjVj A 1U1 AA X VJ 


PAPAAPTTPP 

VjALAAVjX XLVJ 


PTPTAPTPTT 
V* X Vj X nU 1L11 


TATPfiAPATCi 

X A X VlVjnLn X Vj 




AvjlAl XXX AA At X AL AVjL AA 


uLl XAlbAub 


AP APTT AP AT 
ALno X XALAX 


P ATAPTT A PP 
VjAXAVjX 1ALL 


APPTPTAPTT 
ALL X L X AL X X 


1 7ft ft 

1 JOU 


1 AvxAlAvjVjVjA AAALAIAvjLA 


P P 7A P TP T T 


TPAP APT APT 
X bnUnVj X AVj X 


TPP APA APAP 
X VjL AVj AAVjAL. 


PTTPPPTTPP 
L X X VjVjL X X L L 




LlTlAIIVjAA VjALLAAAAAvj 


TAPTA PA PAP 
X AL X hLnLno 


TP A 7APPTTAP 
X AA\jVj X X nb 


TP ATP APA AT 
X Vj A X Vj AVj AA X 


PAPAATPPAP 
VjAL AA X VjL AL 


1 Rft ft 


ptptattttp TiAftrprpac 

LlblAllllL IAAALLLLAVj 


TATPAAPPTT 
X A X VjAAVjL X X 


PTATTPTPPA 
W X ri X X V* X uon 


A A ATA ATPPT 
rvnxi X /vrV low 1 


PPAPPPTPTT 

LLnVJVJL X L X X 


1 560 


ATATAAPTAP APTPATAPPP 
AX AX AAL 1AL AVslunl n^JUL 


AGAPAPTPTCi 


ATAnTfiATPA 


AAATRGPAAA 


GTAAATTAPA 


1620 

X U fc w 


p TV PTTPTPP & TCP AAA A PTP 
uAU llo X vjVjA X VjLrii-lrirlVj 1 Vj 


ATPPGPPAGT 

t\ X VJVJvJL LflVJ X 


PAPTAAPAAP 


ATTTPTTTPT 
nxxxvjx x Xwx 


PTTGATGPGG 

L X X VJ«* X VJ W VJ VJ 


1680 

X U O V 


IPTPT^fiafiT ATTPAPAPPT 
At XLXVjVjAVjX AX XVjAVjAVjLX 


PTTAPPTPTT 
vj x x rivjvj x l x x 


TAfiAPTATfiA 


AAA.APTTAAA 


PAAPTGGATT 
lxvtvl x vjvjn x x 


1740 

X / "I v 


TTPA A ATTPA APPTPP APAP 
X XajAAA X XVjA AVjL X VjL AVj AL 


AATPPPATPP 

AA X VjVjVjA X L L 


PTPAAPTPTP 

X w AA\w 


PAPTPfiPOTT 

V#AV# X V* VJV* VJ X X 


PAAPTAAATP 
LnnL x ruixi x l 


1 800 

X O VI v 


TPAPAATAPT TPATPAA AAT 
XLAVjAAIAVjI IbAlLAAAAl 


PATAATTPPP 
VjAXAAX XVjLL 


PTPTPATA AP 
v» X Vj X Vj A X AAV~ 


TAATPPTPTT 
1/inlLL X V* X X 


PTTAATAATP 

LXXAAXAAXVj 


1 P60 

X O u U 


PPTPPPPTP IA 7A PTTPTPP TT 
VjL IVvjVjvj X VjA AvjI ILXVjLX X 


PPPATPAPPP 
L L L A X L AVj L Vj 


PTPPTP A A A A 
t ILL XLAAnn 


PT A TTT APTT 
Llnl X X AVj X X 


TTPPAPPTPA 
X XLLAVjL xla 


1 Q20 

17&V 


IRfiPPPlfiPft TTP APATPA A 
AAvjLLVjAVjVjA X XUAbAlAsAA 


PPPPAPA APT 
VjVjVjL ALAAL X 


PPP APPTPTT 
LLLAuL X Vj X X 


PTATAPPATA 
Llnl ALL A X A 


PTP APAP ATP 

L X VJ AVJ AVJ A X L 


1 Qft 0 


PAAPP PTTTPPP A TT 
LAAVjLAvjAX X bl X XVjLLAX X 


A APA A APA A A 
AALAAAVsAAn 


PTPPTP A APT 
Vj X VjVj X Vj AAVj X 


PTTPPTPAAA 
Vj X X L L X unnn 


AAAP A ATT A A 
AAALAAX X AA 


2040 

£* v *t v 


RPTPTPAPPl TTP AP APPAP 
AL XVjALLA X X l AvjAvjVjAL 


TTP APP ATA P 
X X onuwi X rl Vj 


TAPTTPPAPT 

X nVJ X X VJV>* AVj X 


PTATPAPTTP 
Vj X n X vjriL x x vj 


PPAAPAPPTT 

VJVJAAVJrlLL X X 


2100 


CATTATCCAC CAATGCTACA GTTAAATTCA TCCTCACCGA CTCTTTTCCT TCTAACGTTG 2160

AAGTCGTTAT TTTGCAACCA TCTGCAGAAG AGCAGCACCA GATCGATATG TCCATTATAT 2220

TCATTGCAGT GCTGGCTGGT GGTTGTGCTT TGCTACTTTT GGCCATCTTT TTTGTGGCCT 2280

GTACTTGTAA AAAGAAAGCT GGTGAATTTA AGCAGGTACC TGAACAACAT GGAACATGCA 2340

TATP7k ikPTA ta pp
A 1 \j AA\j AAL \3 


CCTGTTAAGC 


ACCCCATCTC 


ppp 7a rrnn r*f*w 
LLLAblLutjl 


P TPTTPTTPT 


TTPTP TP TA PT 

1 IblLlLAvvl 


Oil A A 

2400 


LIGAGiLAlb 


CCAACTCTCC 


ATCAATACTG 


7k Ik TP TP Ik P Ik Ik 

AA 1L1 bAvaAA 


TTPP 7k PPP TP 

1 IbLAvjLLlb 


TPPTPTIk IkPP 

TLLiLlAALL 


O A C f\ 

2460 


a a p ta /v lk pp & 
AAviAtiLAViLA 


TCAGCAAACA GGCATAAAGC 


7i P TPP Ik TPTP 

AL I LLAi LI L 


TPT7A PP 7ATPT 
IvjIALLAILI 


TIk TP 7k P Ik P 7k T 

1 A 1 L AL AL A 1 


2520 


PTPPTTPPPTA 

CTGGiiwjLA 


CCTGGACAAT 


TGTGCAATGA 


PP Ik T lk 7k PTPP 

GCATAAGTGG 


Ik P 7k TTPTP Ik P 

AC ATTL l L AL 


X TP PPP P TA P Ik 

ATGGGGCALA 


2580 


fprniv OfPTi f^Ik Ik 7k 

TTAGTALAAA 


GGACAGTGGC 


AAAGGAGATA 


GTGACTTCAA 


TP 7k P 7k PTP Ik P 

TGACAGTGAL 


TPTP 7k TIk PT 7k 

TLTGAi ALTA 


O C A A 

2640 


PTPPTaPTATaTP 

\j IvjvjAvjAAI L 


ACAAAAGAAG AGCATTGAGC 


AGCCAATGCA 


PPPTkPTk Ik PPP 

GGGALAAVjLL 


taptpptp ta ta t 
Pi\j 1 UL 1 LAA1 


2/00 


7k p ik p ik p ik tp ik 
ACACAviA 1 viA 


ATCAGCAGGG 


TTCCGACATG 


CCGATAALTA 


TTTCAGLLAL 


PP Ik Ik TP TA TA P TA 

LVjAA 1 L AAL A 


0 "7 C A 

2/60 


TA /V^PIV^P 7A P Ik 
AbrVjij iLLAliA 


AAATGGGAAC 


TGCACATTGC 


7A Ik T Ik TP Ik Ik 7A Ik 

AA 1 A 1 Vj AAAA 


PPPPT7AT7AP7A 

VjIjVjL 1 A 1 AVj A 


PTPTPTTTAPT 
L 1 Vj 1 L 1 1 AL 1 


Z o z u 


PTPT7A PPTPP 

L Ivj X AviL ILL 


TGCTCATTAC 


AATACCTACC 


7A TPP TA TA P TA 7k T 

A 1 VjLAAbAA 1 


PPPT7A TaPPTP 

VjLL 1AALL Ikj 


PTAPTaTTAPPPTa 
LALA1 ALLVjA 


I 0 0 u 


7A P P Ta T Ik P P P T 
ALLA 1 AtLL 1 


TAGAGACCCT 


TATTACCATA 


TP 7a 7a T7A 7a TPP 


TPTTPPT7A TAT 
loll bL 1 AA 1 


PPP Ti TPP Tl PP 


0 0 A ft 


PPPlk TV TIk TP 7k 
LlroAAlAlliA 


AAGAGATTTA 


GTCAACAGAA 


P TPP Ik 7A PPTT 

VjKjLAALIjI 1 


ta fpn^vnnnn ta p 
AIL ILLVjLAkj 


TlPTA TPPTPTTa 
A\j A 1 L \3 1 L 1 A 


ft ft n 


PP Ik P Ik TTkPP Ik 

GCA(iAiACCA 


AGAATTCAAT 


TACAGTCCGC 


Ik P Ik T Ik TP Ik Ik P 

AGATATCAAG 


Ik P 7k PPTTP Ik T 

AC AVjL TTL A 1 


PPTTP TkPTA TA TA 

LL 1 1LAI?AAA 


^ ft a ft 


mmpp m ia p ik ik p 
TTVjC T ACAAC 


PTTTTTl 71 TP 71 
CI 1 1 Xnnltn 




7k Ik PTP Ik P Ik Ik T 

AAGTGAGAAT 


PP TA P TA TA TA PPP 

GLALAAAGCjL 


TA TaPTPPTTTTA 

AAvj 1 kjL 1 1 1 A 




PP 7a TP 7A 7k 7k PP 

tiLAlvjAAAljL 


TAAATATATG 


GAGTCTCCCC 


1 1 lUtLitib 


TA TPP TA TP PPP 

A 1 v>v?A 1 obuu 


PPTlPTlPTlPTlP 


7 1 R ft 


ft 7A P 7a PTfV & T 


AAATATACAG 


CTGCTTTCTA 


TTTf2P 71 TTTP 
1 1 lotnl 1 


71 PTTPPPTl TA T 


TTTTTflTTTT 
lllllVjllll 


724 0 


flMIIII II1I7A P7AT71 T 

111 InUAlnl 


TTATTTTTCC 


TGAATTGAAT 


f^T^TAPTlTTflT 
\3 1 u/itn 1 1 Vj 1 


PPTCTPTlPPT 
LL 1 1 L AUL 1 


TlTlPTTlPpTlTlT 
Ant 1 A^3l* AA 1 


77ftft 


T 71 7a 71 TPP 7A P 7a 
1 AAA IVUiUi 


GACCTACAGT 


CAAATATTTG 


AkjVjkjLLLL 1 VJ 


Ta Ta TIP 71 PP TIP 7a 
AAALAIjLALA 


TP Tl flTP 71 PP Ti 
1 L AVj 1 L Ao<a A 


77fi ft 


P P T 7A 7k 71 PTPP 
1 AAAo 1\jVj 


CCTTTTTACT 


TTTAGCAGCT 


PPTPPPTPTP 
LL 1 1 L 1 \j 


PPPTPTPTPT 
LLL 1L IVjUjI 


TTA Tl WH^r^HHH 
IAAILAIjLLL 




CTGGTCAAGT CCTGAGTAGG ATCATGGCGT TTTTATATGC ATCTCACCTA CTTTGGACGT 3480

GATTTACACA TAATAGGAAA CGCTTGGTTT CAGTGAAGTC TGTGTTGTAT ATATTCTGTT 3540

ATATACACGC ATTTTGTGTT TGTGTATATA TTTCAAGTCC ATTCAGATAT GTGTATATAG 3600

TGCAGACCTT GTAAATTAAA TATTCTGATA CTTTTTCCTC AATAAATATT TAAAT 3655



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 323 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Val Cys Cys Gly Pro Gly Arg Met Leu Leu Gly Trp Ala Gly Leu
1 5 10 15

Leu Val Leu Ala Ala Leu Cys Leu Leu Gln Val Pro Gly Ala Gln Ala
20 25 30

Ala Ala Cys Glu Pro Val Arg Ile Pro Leu Cys Lys Ser Leu Pro Trp
35 40 45

Asn Met Thr Lys Met Pro Asn His Leu His His Ser Thr Gln Ala Asn
50 55 60

Ala Ile Leu Ala Met Glu Gln Phe Glu Gly Leu Leu Gly Thr His Cys
65 70 75 80

Ser Pro Asp Leu Leu Phe Phe Leu Cys Ala Met Tyr Ala Pro Ile Cys
85 90 95

Thr Ile Asp Phe Gln His Glu Pro Ile Lys Pro Cys Lys Ser Val Cys
100 105 110

Glu Arg Ala Arg Gln Gly Cys Glu Pro Ile Leu Ile Lys Tyr Arg His
115 120 125

Ser Trp Pro Glu Ser Leu Ala Cys Asp Glu Leu Pro Val Tyr Asp Arg
130 135 140

Gly Val Cys Ile Ser Pro Glu Ala Ile Val Thr Ala Asp Gly Ala Asp
145 150 155 160

Phe Pro Met Asp Ser Ser Thr Gly His Cys Arg Gly Ala Ser Ser Glu
165 170 175

Arg Cys Lys Cys Lys Pro Val Arg Ala Thr Gln Lys Thr Tyr Phe Arg
180 185 190

Asn Asn Tyr Asn Tyr Val Ile Arg Ala Lys Val Lys Glu Val Lys Met
195 200 205

Lys Cys His Asp Val Thr Ala Val Val Glu Val Lys Glu Ile Leu Lys
210 215 220

Ala Ser Leu Val Asn Ile Pro Arg Asp Thr Val Asn Leu Tyr Thr Thr
225 230 235 240

Ser Gly Cys Leu Cys Pro Pro Leu Thr Val Asn Glu Glu Tyr Val Ile
245 250 255

Met Gly Tyr Glu Asp Glu Glu Arg Ser Arg Leu Leu Leu Val Glu Gly
260 265 270

Ser Ile Ala Glu Lys Trp Lys Asp Arg Leu Gly Lys Lys Val Lys Arg
275 280 285

Trp Asp Met Lys Leu Arg His Leu Gly Leu Gly Lys Thr Asp Ala Ser
290 295 300

Asp Ser Thr Gln Asn Gln Lys Ser Gly Arg Asn Ser Asn Pro Arg Pro
305 310 315 320

Ala Arg Ser



(2) INFORMATION FOR SEQ ID NO: 8: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2176 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

AAGCCTGGGA CCATGGTCTG CTGCGGCCCG GGACGGATGC TGCTAGGATG GGCCGGGTTG 60 

CTAGTCCTGG CTGCTCTCTG CCTGCTCCAG GTGCCCGGAG CTCAGGCTGC AGCCTGTGAG 120 

CCTGTCCGCA TCCCGCTGTG CAAGTCCCTT CCCTGGAACA TGACCAAGAT GCCCAACCAC 180 

CTGCACCACA GCACCCAGGC TAACGCCATC CTGGCCATGG AACAGTTCGA AGGGCTGCTG 240 

GGCACCCACT GCAGCCCGGA TCTTCTCTTC TTCCTCTGTG CAATGTACGC ACCCATTTGC 300 

ACCATCGACT TCCAGCACGA GCCCATCAAG CCCTGCAAGT CTGTGTGTGA GCGCGCCCGA 360 

CAGGGCTGCG AGCCCATTCT CATCAAGTAC CGCCACTCGT GGCCGGAAAG CTTGGCCTGC 420 

GACGAGCTGC CGGTGTACGA CCGCGGCGTG TGCATCTCTC CTGAGGCCAT CGTCACCGCG 480 

GACGGAGCGG ATTTTCCTAT GGATTCAAGT ACTGGACACT GCAGAGGGGC AAGCAGCGAA 540

CGTTGCAAAT GTAAGCCTGT CAGAGCTACA CAGAAGACCT ATTTCCGGAA CAATTACAAC 600 

TATGTCATCC GGGCTAAAGT TAAAGAGGTA AAGATGAAAT GTCATGATGT GACCGCCGTT 660

GTGGAAGTGA AGGAAATTCT AAAGGCATCA CTGGTAAACA TTCCAAGGGA CACCGTCAAT 720

CTTTATACCA CCTCTGGCTG CCTCTGTCCT CCACTTACTG TCAATGAGGA ATATGTCATC 780

ATGGGCTATG AAGACGAGGA ACGTTCCAGG TTACTCTTGG TAGAAGGCTC TATAGCTGAG 840

AAGTGGAAGG ATCGGCTTGG TAAGAAAGTC AAGCGCTGGG ATATGAAACT CCGACACCTT 900

GGACTGGGTA AAACTGATGC TAGCGATTCC ACTCAGAATC AGAAGTCTGG CAGGAACTCT 960

AATCCCCGGC CAGCACGCAG CTAAATCCTG AAATGTAAAA GGCCACACCC ACGGACTCCC 1020

TTCTAAGACT GGCGCTGGTG GACTAACAAA GGAAAACCGC ACAGTTGTGC TCGTGACCGA 1080

TTGTTTACCG CAGACACCGC GTGGCTACCG AAGTTACTTC CGGTCCCCTT TCTCCTGCTT 1140

CTTAATGGCG TGGGGTTAGA TCCTTTAATA TGTTATATAT TCTGTTTCAT CAATCACGTG 1200

GGGACTGTTC TTTTGCAACC AGAATAGTAA ATTAAATATG TTGATGCTAA GGTTTCTGTA 1260

CTGGACTCCC TGGGTTTAAT TTGGTGTTCT GTACCCTGAT TGAGAATGCA ATGTTTCATG 1320

TAAAGAGAGA ATCCTGGTCA TATCTCAAGA ACTAGATATT GCTGTAAGAC AGCCTCTGCT 1380

GCTGCGCTTA TAGTCTTGTG TTTGTATGCC TTTGTCCATT TCCCTCATGC TGTGAAAGTT 1440

ATACATGTTT ATAAAGGTAG AACGGCATTT TGAAATCAGA CACTGCACAA GCAGAGTAGC 1500

CCAACACCAG GAAGCATTTA TGAGGAAACG CCACACAGCA TGACTTATTT TCAAGATTGG 1560

CAGGCAGCAA AATAAATAGT GTTGGGAGCC AAGAAAAGAA TATTTTGCCT GGTTAAGGGG 1620

CACACTGGAA TCAGTAGCCC TTGAGCCATT AACAGCAGTG TTCTTCTGGC AAGTTTTTGA 1680

TTTGTTCATA AATGTATTCA CGAGCATTAG AGATGAACTT ATAACTAGAC ATCTGTTGTT 1740

ATCTCTATAG CTCTGCTTCC TTCTAAATCA AACCCATTGT TGGATGCTCC CTCTCCATTC 1800

ATAAATAAAT TTGGCTTGCT GTATTGGCCA GGAAAAGAAA GTATTAAAGT ATGCATGCAT 1860






GTGTTATTTA ACAGAviVa X A I GTAACTCTAT AAAAGACTAT AATTTACAGG 1920
ACACGGAAAT GTGCACATTT GTTTACTTTT TTTCTTCCTT TTGCTTTGGG CTTGTGATTT 1980
TGGTTTTTGG TGTGTTTATG TCTGTATTTT GGGGGGTGGG TAGGTTTAAG CCATTGCACA 2040
TTCAAGTTGA ACTAGATTAG AGTAGACTAG GCTCATTGGC CTAGACATTA TGATTTGAAT 2100
TTGTGTTGTT TAATGCTCCA TCAAGATGTC TAATAAAAGG AATATGGTTG TCAACAGAGA 2160
CGACAACAAC AACAAA 2176



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 325 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Val Cys Gly Ser Pro Gly Gly Met Leu Leu Leu Arg Ala Gly Leu
1 5 10 15

Leu Ala Leu Ala Ala Leu Cys Leu Leu Arg Val Pro Gly Ala Arg Ala
20 25 30

Ala Ala Cys Glu Pro Val Arg Ile Pro Leu Cys Lys Ser Leu Pro Trp
35 40 45

Asn Met Thr Lys Met Pro Asn His Leu His His Ser Thr Gln Ala Asn
50 55 60

Ala Ile Leu Ala Ile Glu Gln Phe Glu Gly Leu Leu Gly Thr His Cys
65 70 75 80

Ser Pro Asp Leu Leu Phe Phe Leu Cys Ala Met Tyr Ala Pro Ile Cys
85 90 95

Thr Ile Asp Phe Gln His Glu Pro Ile Lys Pro Cys Lys Ser Val Cys
100 105 110

Glu Arg Ala Arg Gln Gly Cys Glu Pro Ile Leu Ile Lys Tyr Arg His
115 120 125

Ser Trp Pro Glu Asn Leu Ala Cys Glu Glu Leu Pro Val Tyr Asp Arg
130 135 140

Gly Val Cys Ile Ser Pro Glu Ala Ile Val Thr Ala Asp Gly Ala Asp
145 150 155 160

Phe Pro Met Asp Ser Ser Asn Gly Asn Cys Arg Gly Ala Ser Ser Glu
165 170 175

Arg Cys Lys Cys Lys Pro Ile Arg Ala Thr Gln Lys Thr Tyr Phe Arg
180 185 190

Asn Asn Tyr Asn Tyr Val Ile Arg Ala Lys Val Lys Glu Ile Lys Thr
195 200 205

Lys Cys His Asp Val Thr Ala Val Val Glu Val Lys Glu Ile Leu Lys
210 215 220

Ser Ser Leu Val Asn Ile Pro Arg Asp Thr Val Asn Leu Tyr Thr Ser
225 230 235 240

Ser Gly Cys Leu Cys Pro Pro Leu Asn Val Asn Glu Glu Tyr Ile Ile
245 250 255

Met Gly Tyr Glu Asp Glu Glu Arg Ser Arg Leu Leu Leu Val Glu Gly
260 265 270

Ser Ile Ala Glu Lys Trp Lys Asp Arg Leu Gly Lys Lys Val Lys Arg
275 280 285

Trp Asp Met Lys Leu Arg His Leu Gly Leu Ser Lys Ser Asp Ser Ser
290 295 300

Asn Ser Asp Ser Thr Gln Ser Gln Lys Ser Gly Arg Asn Ser Asn Pro
305 310 315 320

Arg Gln Ala Arg Asn
325



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1893 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGCGGAGCGG GCCTTTTGGC GTCCACTGCG CGGCTGCACC CTGCCCCATC TGCCGGGATC 60 

ATGGTCTGCG GCAGCCCGGG AGGGATGCTG CTGCTGCGGG CCGGGCTGCT TGCCCTGGCT 120 

GCTCTCTGCC TGCTCCGGGT GCCCGGGGCT CGGGCTGCAG CCTGTGAGCC CGTCCGCATC 180 

CCCCTGTGCA AGTCCCTGCC CTGGAACATG ACTAAGATGC CCAACCACCT GCACCACAGC 240

ACTCAGGCCA nvNJvwl X WW X GGPPATPCAC PACTTPCAAC CTPTCPTCCC W AW WW AW 1 ow 300

AGCCCCGATC TGCTCTTPTT CPTPTGTGPP ATCTAPCPCP PPATPTCPAP PATTPAPTTP 360

CAGCACGAGC CCATCAAGCC PTGTAAGTPT O X O X wwVAww CCCPPPCCPA ooow X o X OAO 420

CCCATACTCA TCAAGTACPG PPAPTPCTCC PPCCACAAPP TCCPPTCPCA OOAOw X OWWA 480

GTGTAPGACA GGGGPCTCTC PATPTPTPPP PAPPPPATPP X X Aw X VjwVjvja WOO AO W X OA X 540

TTTCCTATGG ATTCTAGTAA PGCAAAPTCT ACACCCCPAA CPACTCAAPC PTCTAAATCT 600

AAGCCTATTA GAGCTACACA GAAGAPPTAT TTPPCCAAPA ATTAPAAPTA TCTPATTPCC 660

GCTAAAGTTA AAGAGATAAA GACTAAGTGP PATGATGTGA PTGPACTACT CCACCTCAAC 720

GAGATTCTAA AGTCCTCTCT GGTAAAPATT PPAPGGGAPA PTCTPAAPPT PTATAPPACP 780

TCTGGCTGCC TCTGCCCTCC APTTAATGTT AATGAGGAAT ATATPATPAT CCCPTATCAA 840

GATGAGGAAC GTTCCAGATT APTPTTGGTG GAAGGPTPTA TACPTCACAA CTCCAACCAT 900

CGACTCGGTA AAAAAGTTAA GPGPTGGGAT ATGAACPTTP CTPATPTTCC APTPACTAAA 960

AGTGATTCTA GPAATAGTGA TTPPAPTPAC ACTPACAACT PTCCPACCAA w X woaaUwww 1020

CGGCAAGCAC GCAACTAAAT PPPGAAATAP AAAAAGTAAP APACTCCAPT TPPTATTAAC 1080

ACTTACTTGC ATTGCTGGAC TAGCAAAGGA AAATTGPAPT ATTCPAPATP ATATTPTATT 1140

GTTTACTATA AAAATCATGT GATAACTGAT TATTAPTTPT GTTTPTPTTT TCCTTTPTCP 1200

TTCTCTCTTC TCTCAACCCC TTTGTAATGG TTTGGGGGCA GACTCTTAAG TATATTGTGA 1260

GTTTTCTATT TCACTAATCA TGAGAAAAAC TGTTPTTTTG PAATAATAAT AAATTAAAPA 1320

TGCTGTTACC AGAGPPTPTT TGPTGAGTPT PPAGATGTTA ATTT APTTTf 1 TCPAPPPPAA 1380

TTGGGAATCP AATATTCCAT CAAAACACAC CTTTPTCCTli TTP A P & P & A 21 pp rn » p * rn » rpp 1440

CCTTAAAAPA TAPTPTCPPC ATPTAATTAP ACPPTTATTT TTCTATCPPT TTTPPPPATT 1500

CTCCTCATGC TTAGAAAGTT PPAAATCTTT ATAAACCTAA AATCCPACTT TCAACTPAAA 1560

TGTCACATAG GCAAAGCAAT CAAGCACCAG GAAGTGTTTA TGAGGAAACA ACACCCAAGA 1620

TGAATTATTT TTGAGACTGT CAGGAAGTAA AATAAATAGG AGCTTAAGAA AGAACATTTT 1680

GCCTGATTGA GAAGCACAAC TGAAACCAGT AGCCGCTGGG GTGTTAATGG TAGCATTCTT 1740

CTTTTGGCAA TACATTTGAT TTGTTCATGA ATATATTAAT CAGCATTAGA GAAATGAATT 1800

ATAACTAGAC ATCTGCTGTT ATCACCATAG TTTTGTTTAA TTTGCTTCCT TTTAAATAAA 1860

CCCATTGGTG AAA^TCAAAA AAAAAAAAAA AAA 1893